AITopics | augmentation process

Collaborating Authors

augmentation process

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sample-Efficient Multi-Round Generative Data Augmentation for Long-Tail Instance Segmentation

Neural Information Processing SystemsJun-12-2026, 14:50:27 GMT

Data synthesis has become increasingly crucial for long-tail instance segmentation tasks to mitigate class imbalance and high annotation costs. Previous methods have primarily prioritized the selection of data from a pre-generated image object pool, which frequently leads to the inefficient utilization of generated data.

artificial intelligence, machine learning, proceedings, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.58)

Add feedback

Explanatory Debiasing: Involving Domain Experts in the Data Generation Process to Mitigate Representation Bias in AI Systems

Bhattacharya, Aditya, Stumpf, Simone, De Croon, Robin, Verbert, Katrien

arXiv.org Artificial IntelligenceDec-26-2024

Representation bias is one of the most common types of biases in artificial intelligence (AI) systems, causing AI models to perform poorly on underrepresented data segments. Although AI practitioners use various methods to reduce representation bias, their effectiveness is often constrained by insufficient domain knowledge in the debiasing process. To address this gap, this paper introduces a set of generic design guidelines for effectively involving domain experts in representation debiasing. We instantiated our proposed guidelines in a healthcare-focused application and evaluated them through a comprehensive mixed-methods user study with 35 healthcare experts. Our findings show that involving domain experts can reduce representation bias without compromising model accuracy. Based on our findings, we also offer recommendations for developers to build robust debiasing systems guided by our generic design guidelines, ensuring more effective inclusion of domain experts in the debiasing process.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.01441

Country:

Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
North America > United States > New York > New York County > New York City (0.05)
(17 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Health & Medicine > Consumer Health (0.68)
Health & Medicine > Diagnostic Medicine (0.68)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Human Computer Interaction > Interfaces (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

PolyIPA -- Multilingual Phoneme-to-Grapheme Conversion Model

Lauc, Davor

arXiv.org Artificial IntelligenceDec-12-2024

This paper presents PolyIPA, a novel multilingual phoneme-to-grapheme conversion model designed for multilingual name transliteration, onomastic research, and information retrieval. The model leverages two helper models developed for data augmentation: IPA2vec for finding soundalikes across languages, and similarIPA for handling phonetic notation variations. Evaluated on a test set that spans multiple languages and writing systems, the model achieves a mean Character Error Rate of 0.055 and a character-level BLEU score of 0.914, with particularly strong performance on languages with shallow orthographies. The implementation of beam search further improves practical utility, with top-3 candidates reducing the effective error rate by 52.7\% (to CER: 0.026), demonstrating the model's effectiveness for cross-linguistic applications.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2412.09102

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Croatia > Zagreb County > Zagreb (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback

Fair Augmentation for Graph Collaborative Filtering

Boratto, Ludovico, Fabbri, Francesco, Fenu, Gianni, Marras, Mirko, Medda, Giacomo

arXiv.org Artificial IntelligenceAug-22-2024

Recent developments in recommendation have harnessed the collaborative power of graph neural networks (GNNs) in learning users' preferences from user-item networks. Despite emerging regulations addressing fairness of automated systems, unfairness issues in graph collaborative filtering remain underexplored, especially from the consumer's perspective. Despite numerous contributions on consumer unfairness, only a few of these works have delved into GNNs. A notable gap exists in the formalization of the latest mitigation algorithms, as well as in their effectiveness and reliability on cutting-edge models. This paper serves as a solid response to recent research highlighting unfairness issues in graph collaborative filtering by reproducing one of the latest mitigation methods. The reproduced technique adjusts the system fairness level by learning a fair graph augmentation. Under an experimental setup based on 11 GNNs, 5 non-GNN models, and 5 real-world networks across diverse domains, our investigation reveals that fair graph augmentation is consistently effective on high-utility models and large datasets. Experiments on the transferability of the fair augmented graph open new issues for future recommendation studies. Source code: https://github.com/jackmedda/FA4GCF.

augmentation process, graph, recommendation, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3640457.3688064

2408.12208

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.14)
Europe > Italy > Apulia > Bari (0.05)
(27 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.48)

Industry:

Leisure & Entertainment (0.93)
Media > Music (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Prompt Selection and Augmentation for Few Examples Code Generation in Large Language Model and its Application in Robotics Control

Wu, On Tai, Chan, Frodo Kin Sun, Zhang, Zunhao, Law, Yan Nei, Drescher, Benny, Lai, Edmond Shiao Bun

arXiv.org Artificial IntelligenceMar-11-2024

Few-shot prompting and step-by-step reasoning have enhanced the capabilities of Large Language Models (LLMs) in tackling complex tasks including code generation. In this paper, we introduce a prompt selection and augmentation algorithm aimed at improving mathematical reasoning and robot arm operations. Our approach incorporates a multi-stage example augmentation scheme combined with an example selection scheme. This algorithm improves LLM performance by selecting a set of examples that increase diversity, minimize redundancy, and increase relevance to the question. When combined with the Program-of-Thought prompting, our algorithm demonstrates an improvement in performance on the GSM8K and SVAMP benchmarks, with increases of 0.3% and 1.1% respectively. Furthermore, in simulated tabletop environments, our algorithm surpasses the Code-as-Policies approach by achieving a 3.4% increase in successful task completions and a decrease of over 70% in the number of examples used. Its ability to discard examples that contribute little to solving the problem reduces the inferencing time of an LLM-powered robotics system. This algorithm also offers important benefits for industrial process automation by streamlining the development and deployment process, reducing manual programming effort, and enhancing code reusability.

arxiv, augmentation, language model, (16 more...)

arXiv.org Artificial Intelligence

2403.12999

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Neural Network Architecture for Database Augmentation Using Shared Features

Sleeman, William C. IV, Kapoor, Rishabh, Ghosh, Preetam

arXiv.org Artificial IntelligenceFeb-2-2023

The popularity of learning from data with machine learning and neural networks has lead to the creation of many new datasets for almost every problem domain. However, even within a single domain, these datasets are often collected with disparate features, sampled from different sub-populations, and recorded at different time points. Even with the plethora of individual datasets, large data science projects can be difficult as it is often not trivial to merge these smaller datasets. Inherent challenges in some domains such as medicine also makes it very difficult to create large single source datasets or multi-source datasets with identical features. Instead of trying to merge these non-matching datasets directly, we propose a neural network architecture that can provide data augmentation using features common between these datasets. Our results show that this style of data augmentation can work for both image and tabular data.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2302.01374

Country: North America > United States > Virginia (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Predicting Survival Outcomes in the Presence of Unlabeled Data

Haredasht, Fateme Nateghi, Vens, Celine

arXiv.org Artificial IntelligenceOct-25-2022

Many clinical studies require the follow-up of patients over time. This is challenging: apart from frequently observed drop-out, there are often also organizational and financial challenges, which can lead to reduced data collection and, in turn, can complicate subsequent analyses. In contrast, there is often plenty of baseline data available of patients with similar characteristics and background information, e.g., from patients that fall outside the study time window. In this article, we investigate whether we can benefit from the inclusion of such unlabeled data instances to predict accurate survival times. In other words, we introduce a third level of supervision in the context of survival analysis, apart from fully observed and censored instances, we also include unlabeled instances. We propose three approaches to deal with this novel setting and provide an empirical comparison over fifteen real-life clinical and gene expression survival datasets. Our results demonstrate that all approaches are able to increase the predictive performance over independent test data. We also show that integrating the partial supervision provided by censored data in a semi-supervised wrapper approach generally provides the best results, often achieving high improvements, compared to not using unlabeled data.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2210.13891

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Norway (0.04)
Asia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback